Automatic initial and final segmentation in cleft palate speech of Mandarin speakers

Authors

  • Ling He
  • Yin Liu
  • Heng Yin
  • Junpeng Zhang
  • Jing Zhang
  • Jiang Zhang
Abstract

Speech unit segmentation is an important preprocessing step in the analysis of cleft palate speech. In Mandarin, a syllable is composed of two parts: an initial and a final. In cleft palate speech, resonance disorders occur at the finals and the voiced initials, while articulation disorders occur at the unvoiced initials. Thus, initials and finals are the minimum speech units that can reflect the characteristics of cleft palate speech disorders. In this work, an automatic initial/final segmentation method is proposed. The tested cleft palate speech utterances were collected from the Cleft Palate Speech Treatment Center in the Hospital of Stomatology, Sichuan University, which treats the largest number of cleft palate patients in China. The cleft palate speech data include 824 speech segments, and the control samples contain 228 speech segments. First, syllables are extracted from the speech utterances. The proposed syllable extraction method requires no training stage and achieves good performance for both voiced and unvoiced speech. Then, the syllables are classified as having either "quasi-unvoiced" or "quasi-voiced" initials. A separate initial/final segmentation method is proposed for each of these two types of syllables. Moreover, a two-step segmentation method is proposed: the rough locations of the syllable and initial/final boundaries are refined in the second segmentation step to improve the robustness of the segmentation. The experiments show that initial/final segmentation accuracy is higher for syllables with quasi-unvoiced initials than for those with quasi-voiced initials. For the cleft palate speech, the mean time error is 4.4 ms for syllables with quasi-unvoiced initials and 25.7 ms for syllables with quasi-voiced initials, and the correct segmentation accuracy P30 over all syllables is 91.69%. For the control samples, P30 over all syllables is 91.24%.
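The two evaluation figures quoted in the abstract, mean time error and P30, can be illustrated with a short sketch. The abstract does not define P30; the code below assumes it is the percentage of detected boundaries that fall within 30 ms of a manually labeled reference boundary. The function names and the sample boundary times are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def boundary_errors_ms(predicted: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Absolute time error (ms) between each detected and reference boundary."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    return [abs(p - r) for p, r in zip(predicted, reference)]


def mean_time_error_ms(errors: Sequence[float]) -> float:
    """Average absolute boundary error in milliseconds."""
    if not errors:
        raise ValueError("errors must not be empty")
    return sum(errors) / len(errors)


def percent_within(errors: Sequence[float], tolerance_ms: float = 30.0) -> float:
    """Percentage of boundaries whose error is at most tolerance_ms.

    With tolerance_ms=30 this matches the ASSUMED reading of P30.
    """
    if not errors:
        raise ValueError("errors must not be empty")
    hits = sum(1 for e in errors if e <= tolerance_ms)
    return 100.0 * hits / len(errors)


# Hypothetical initial/final boundary times in milliseconds (illustration only).
predicted = [102.0, 250.0, 431.0, 640.0]
reference = [100.0, 245.0, 470.0, 655.0]

errors = boundary_errors_ms(predicted, reference)
print(mean_time_error_ms(errors))  # mean absolute error in ms
print(percent_within(errors))      # assumed P30, in percent
```

Under this reading, a single badly misplaced boundary raises the mean time error noticeably while lowering P30 by only one boundary's share, which is one reason both figures are usually reported together.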

Similar resources

Evaluation of Receptive and Expressive Vocabulary in 6-18 Month’s-old Children With Cleft Lip and Palate

Objectives: One of the factors predicting language impairments is an early limited lexicon in children. An early limited lexicon can also lead to limited performance in other language areas. This study aimed to examine receptive and expressive vocabulary in 8-16 month-old children with cleft lip and palate as a predictor of development in other language areas. Materials: The MacArthur-Bat...

Word segmentation in Persian continuous speech using F0 contour

Word segmentation in continuous speech is a complex cognitive process. Previous research on spoken word segmentation has revealed that in fixed-stress languages, listeners use acoustic cues to stress to de-segment speech into words. It has been further assumed that stress in non-final or non-initial position hinders the demarcative function of this prosodic factor. In Persian, stress is retract...

Speech intelligibility after repair of cleft lip and palate

Background: Intelligibility refers to understandability of speech; and lack of it can negatively affect children’s overall communication effectiveness. Children with repaired cleft lip and/or cleft palate (CL/P) may experience poor speech intelligibility. This study aimed at evaluating speech intelligibility in children with repaired CL/P who had not been referred to sp...

Evaluation of speech intelligibility for children with cleft lip and palate by means of automatic speech recognition.

OBJECTIVE: Cleft lip and palate (CLP) may cause functional limitations even after adequate surgical and non-surgical treatment, speech disorders being one of them. Interindividually, they vary a lot, showing typical articulation specifics such as nasal emission and shift of articulation and therefore a diminished intelligibility. Until now, an objective means to determine and quantify the intell...

Decision Tree Classification Approach for Model Selection in Segmenting Mandarin TTS Corpus

High-accuracy automatic segmentation of a Mandarin TTS (text-to-speech) corpus is vital for obtaining the high-quality syllable boundaries needed for corpus-based speech synthesis. Among the existing methods, most studies on automatic segmentation are based upon a single model, ignoring the diverse time marks gained by different models in specific Mandarin boundary environments. In this paper, three hidden Marko...

Journal title:

Volume 12  Issue 

Pages  -

Publication date: 2017